1.
Data Brief ; 35: 106852, 2021 Apr.
Article in English | MEDLINE | ID: mdl-33644273

ABSTRACT

Ticks from the genus Rhipicephalus have enormous global economic impact as ectoparasites of cattle. Rhipicephalus microplus and Rhipicephalus annulatus are known to harbor infectious pathogens such as Babesia bovis, Babesia bigemina, and Anaplasma marginale. Having reference quality genomes of these ticks would advance research to identify druggable targets for chemical entities with acaricidal activity and refine anti-tick vaccine approaches. We sequenced and assembled the genomes of R. microplus and R. annulatus, using Pacific Biosciences and HiSeq 4000 technologies on very high molecular weight genomic DNA. We used 22 and 29 SMRT cells on the Pacific Biosciences Sequel for R. microplus and R. annulatus, respectively, and 3 lanes of the Illumina HiSeq 4000 platform for each tick. The PacBio sequence yields for R. microplus and R. annulatus were 21.0 and 27.9 million subreads, respectively, which were assembled with Canu v. 1.7. The final Canu assemblies consisted of 92,167 and 57,796 contigs with an average contig length of 39,249 and 69,055 bp for R. microplus and R. annulatus, respectively. Annotated genome quality was assessed by BUSCO analysis to provide quantitative measures for each assembled genome. Over 82% and 92% of the 1066 member BUSCO gene set was found in the assembled genomes of R. microplus and R. annulatus, respectively. For R. microplus, only 189 of the 1066 BUSCO genes were missing and only 140 were present in a fragmented condition. For R. annulatus, only 75 of the BUSCO genes were missing and only 109 were present in a fragmented condition. The raw sequencing reads and the assembled contigs/scaffolds are archived at the National Center for Biotechnology Information.
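The BUSCO percentages quoted in this abstract can be checked against the missing and fragmented counts it reports. The following is a minimal sketch, assuming that "found" means complete plus fragmented genes (the abstract does not define the term); the function name `busco_summary` and its dictionary keys are hypothetical and exist only for this illustration.

```python
# Sketch: BUSCO percentages from the counts quoted in the abstract.
# Assumption (not stated in the source): "found" = complete + fragmented.

BUSCO_TOTAL = 1066  # size of the BUSCO gene set given in the abstract


def busco_summary(missing: int, fragmented: int, total: int = BUSCO_TOTAL) -> dict:
    """Return the percentage of genes found, complete, fragmented and missing."""
    found = total - missing          # genes detected in any condition
    complete = found - fragmented    # detected and not fragmented
    return {
        "found_pct": 100.0 * found / total,
        "complete_pct": 100.0 * complete / total,
        "fragmented_pct": 100.0 * fragmented / total,
        "missing_pct": 100.0 * missing / total,
    }


# Counts taken directly from the abstract.
microplus = busco_summary(missing=189, fragmented=140)
annulatus = busco_summary(missing=75, fragmented=109)

for name, summary in (("R. microplus", microplus), ("R. annulatus", annulatus)):
    print(name, {key: round(value, 1) for key, value in summary.items()})
```

Under this reading, the "over 82%" and "over 92%" figures correspond to `found_pct`; the share of complete (non-fragmented) genes is lower in both assemblies.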

2.
Data Brief ; 27: 104602, 2019 Dec.
Article in English | MEDLINE | ID: mdl-31656838

ABSTRACT

The longhorned tick, Haemaphysalis longicornis, feeds upon a wide range of bird and mammalian hosts. Mammalian hosts include cattle, deer, sheep, goats, humans, and horses. This tick is known to transmit a number of pathogens causing tick-borne diseases, and was the vector of a recent serious outbreak of oriental theileriosis in New Zealand. A New Zealand-USA consortium was established to sequence, assemble, and annotate the genome of this tick, using ticks obtained from New Zealand's North Island. In New Zealand, the tick is considered exclusively parthenogenetic and this trait was deemed useful for genome assembly. Very high molecular weight genomic DNA was sequenced on the Illumina HiSeq4000 and the long-read Pac Bio Sequel platforms. Twenty-eight SMRT cells produced a total of 21.3 million reads which were assembled with Canu on a reserved supercomputer node with access to 12 TB of RAM, running continuously for over 24 days. The final assembly dataset consisted of 34,211 contigs with an average contig length of 215,205 bp. The quality of the annotated genome was assessed by BUSCO analysis, an approach that provides quantitative measures for the quality of an assembled genome. Over 95% of the BUSCO gene set was found in the assembled genome. Only 48 of the 1066 BUSCO genes were missing and only 9 were present in a fragmented condition. The raw sequencing reads and the assembled contigs/scaffolds are archived at the National Center for Biotechnology Information.
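The abstract gives a contig count and an average contig length but no total assembly size. Multiplying the two yields a rough estimate of the assembled span. The sketch below is a derived figure, not a value reported in the source, and the helper name `approx_assembly_span` is hypothetical.

```python
# Sketch: estimate total assembled bases from contig count x mean contig length.
# The result is a derived approximation, not a figure reported in the abstract.


def approx_assembly_span(n_contigs: int, mean_contig_bp: float) -> float:
    """Approximate total assembly span in base pairs."""
    return n_contigs * mean_contig_bp


# Values quoted in the abstract for the H. longicornis Canu assembly.
span_bp = approx_assembly_span(n_contigs=34_211, mean_contig_bp=215_205)
print(f"Approximate assembly span: {span_bp / 1e9:.2f} Gb")
```

Because the average contig length is itself a rounded value, the product should be read as an order-of-magnitude check (roughly 7.4 Gb) and not as an exact assembly size.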

3.
Nat Methods ; 14(11): 1063-1071, 2017 Nov.
Article in English | MEDLINE | ID: mdl-28967888

ABSTRACT

Methods for assembly, taxonomic profiling and binning are key to interpreting metagenome data, but a lack of consensus about benchmarking complicates performance assessment. The Critical Assessment of Metagenome Interpretation (CAMI) challenge has engaged the global developer community to benchmark their programs on highly complex and realistic data sets, generated from ∼700 newly sequenced microorganisms and ∼600 novel viruses and plasmids and representing common experimental setups. Assembly and genome binning programs performed well for species represented by individual genomes but were substantially affected by the presence of related strains. Taxonomic profiling and binning programs were proficient at high taxonomic ranks, with a notable performance decrease below family level. Parameter settings markedly affected performance, underscoring their importance for program reproducibility. The CAMI results highlight current challenges but also provide a roadmap for software selection to answer specific research questions.


Subjects
Metagenomics; Software; Algorithms; Benchmarking; Sequence Analysis, DNA
4.
PLoS One ; 11(10): e0165395, 2016.
Article in English | MEDLINE | ID: mdl-27788220

ABSTRACT

BACKGROUND: The Cancer Genome Atlas Project (TCGA) is a National Cancer Institute effort to profile at least 500 cases of 20 different tumor types using genomic platforms and to make these data, both raw and processed, available to all researchers. TCGA data are currently over 1.2 Petabyte in size and include whole genome sequence (WGS), whole exome sequence, methylation, RNA expression, proteomic, and clinical datasets. Publicly accessible TCGA data are released through public portals, but many challenges exist in navigating and using data obtained from these sites. We developed TCGA Expedition to support the research community focused on computational methods for cancer research. Data obtained, versioned, and archived using TCGA Expedition supports command line access at high-performance computing facilities as well as some functionality with third party tools. For a subset of TCGA data collected at University of Pittsburgh, we also re-associate TCGA data with de-identified data from the electronic health records. Here we describe the software as well as the architecture of our repository, methods for loading of TCGA data to multiple platforms, and security and regulatory controls that conform to federal best practices. RESULTS: TCGA Expedition software consists of a set of scripts written in Bash, Python and Java that download, extract, harmonize, version and store all TCGA data and metadata. The software generates a versioned, participant- and sample-centered, local TCGA data directory with metadata structures that directly reference the local data files as well as the original data files. The software supports flexible searches of the data via a web portal, user-centric data tracking tools, and data provenance tools. 
Using this software, we created a collaborative repository, the Pittsburgh Genome Resource Repository (PGRR) that enabled investigators at our institution to work with all TCGA data formats, and to interrogate these data with analysis pipelines, and associated tools. WGS data are especially challenging for individual investigators to use, due to issues with downloading, storage, and processing; having locally accessible WGS BAM files has proven invaluable. CONCLUSION: Our open-source, freely available TCGA Expedition software can be used to create a local collaborative infrastructure for acquiring, managing, and analyzing TCGA data and other large public datasets.


Subjects
Database Management Systems; Genomics; Neoplasms/genetics; Humans; Information Storage and Retrieval; Software; User-Computer Interface
5.
Sci Rep ; 4: 7081, 2014 Nov 25.
Article in English | MEDLINE | ID: mdl-25420880

ABSTRACT

We present a new transcriptome assembly of the Pacific whiteleg shrimp (Litopenaeus vannamei), the species most farmed for human consumption. Its functional annotation, a substantial improvement over previous ones, is provided freely. RNA-Seq with Illumina HiSeq technology was used to analyze samples extracted from shrimp abdominal muscle, hepatopancreas, gills and pleopods. We used the Trinity and Trinotate software suites for transcriptome assembly and annotation, respectively. The quality of this assembly and the affiliated targeted homology searches greatly enrich the curated transcripts currently available in public databases for this species. Comparison with the model arthropod Daphnia allows some insights into defining characteristics of decapod crustaceans. This large-scale gene discovery gives the broadest depth yet to the annotated transcriptome of this important species and should be of value to ongoing genomics and immunogenetic resistance studies in this shrimp of paramount global economic importance.


Subjects
Aquaculture; Penaeidae/genetics; Penaeidae/metabolism; Seafood; Transcriptome; Algorithms; Animals; Crustacea/genetics; Crustacea/metabolism; DNA Replication/genetics; Daphnia/metabolism; Databases, Genetic; Genomics; Immune System/metabolism; Sequence Analysis, RNA; User-Computer Interface
6.
Concurr Comput ; 26(13): 2157-2166, 2014 Sep 10.
Article in English | MEDLINE | ID: mdl-25294974

ABSTRACT

A variety of extremely challenging biological sequence analyses were conducted on the XSEDE large shared memory resource Blacklight, using current bioinformatics tools and encompassing a wide range of scientific applications. These include genomic sequence assembly, very large metagenomic sequence assembly, transcriptome assembly, and sequencing error correction. The data sets used in these analyses included uncategorized fungal species, reference microbial data, very large soil and human gut microbiome sequence data, and primate transcriptomes, composed of both short-read and long-read sequence data. A new parallel command execution program was developed on the Blacklight resource to handle some of these analyses. These results, initially reported previously at XSEDE13 and expanded here, represent significant advances for their respective scientific communities. The breadth and depth of the results achieved demonstrate the ease of use, versatility, and unique capabilities of the Blacklight XSEDE resource for scientific analysis of genomic and transcriptomic sequence data, and the power of these resources, together with XSEDE support, in meeting the most challenging scientific problems.

7.
J Am Med Inform Assoc ; 21(2): 195-9, 2014.
Article in English | MEDLINE | ID: mdl-23964072

ABSTRACT

In the USA, the national cyberinfrastructure refers to a system of research supercomputer and other IT facilities and the high speed networks that connect them. These resources have been heavily leveraged by scientists in disciplines such as high energy physics, astronomy, and climatology, but until recently they have been little used by biomedical researchers. We suggest that many of the 'Big Data' challenges facing the medical informatics community can be efficiently handled using national-scale cyberinfrastructure. Resources such as the Extreme Science and Engineering Discovery Environment, the Open Science Grid, and Internet2 provide economical and proven infrastructures for Big Data challenges, but these resources can be difficult to approach. Specialized web portals, support centers, and virtual organizations can be constructed on these resources to meet defined computational challenges, specifically for genomics. We provide examples of how this has been done in basic biology as an illustration for the biomedical informatics community.


Subjects
Biomedical Research/organization & administration; Computer Systems; Information Dissemination/methods; Internet; Genomics; Medical Informatics
8.
Nat Protoc ; 8(8): 1494-512, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23845962

ABSTRACT

De novo assembly of RNA-seq data enables researchers to study transcriptomes without the need for a genome sequence; this approach can be usefully applied, for instance, in research on 'non-model organisms' of ecological and evolutionary importance, cancer samples or the microbiome. In this protocol we describe the use of the Trinity platform for de novo transcriptome assembly from RNA-seq data in non-model organisms. We also present Trinity-supported companion utilities for downstream applications, including RSEM for transcript abundance estimation, R/Bioconductor packages for identifying differentially expressed transcripts across samples and approaches to identify protein-coding genes. In the procedure, we provide a workflow for genome-independent transcriptome analysis leveraging the Trinity platform. The software, documentation and demonstrations are freely available from http://trinityrnaseq.sourceforge.net. The run time of this protocol is highly dependent on the size and complexity of data to be analyzed. The example data set analyzed in the procedure detailed herein can be processed in less than 5 h.


Subjects
Gene Expression Profiling/methods; RNA/chemistry; Software; Transcriptome; Base Sequence; Schizosaccharomyces/genetics; Schizosaccharomyces pombe Proteins/chemistry; Schizosaccharomyces pombe Proteins/genetics; Sequence Analysis, RNA/methods
9.
Biophys J ; 95(4): 1866-76, 2008 Aug.
Article in English | MEDLINE | ID: mdl-18469070

ABSTRACT

N-BAR domains are protein modules that bind to and induce curvature in membranes via a charged concave surface and N-terminal amphipathic helices. Recently, molecular dynamics simulations have demonstrated that the N-BAR domain can induce a strong local curvature that matches the curvature of the BAR domain surface facing the bilayer. Here we present further molecular dynamics simulations that examine in greater detail the roles of the concave surface and amphipathic helices in driving local membrane curvature. We find that the strong curvature induction observed in our previous simulations requires the stable presentation of the charged concave surface to the membrane and is not driven by the membrane-embedded amphipathic helices. Nevertheless, without these amphipathic helices embedded in the membrane, the N-BAR domain does not maintain a close association with the bilayer, and fails to drive membrane curvature. Increasing the membrane negative charge through the addition of PIP(2) facilitates closer association with the membrane in the absence of embedded helices. At sufficiently high concentrations, amphipathic helices embedded in the membrane drive membrane curvature independently of the BAR domain.


Subjects
Lipid Bilayers/chemistry; Membrane Fluidity; Membrane Proteins/chemistry; Membrane Proteins/ultrastructure; Models, Chemical; Models, Molecular; Phospholipids/chemistry; Computer Simulation; Protein Conformation; Protein Structure, Tertiary
10.
Biophys J ; 92(10): 3595-602, 2007 May 15.
Article in English | MEDLINE | ID: mdl-17325001

ABSTRACT

Liposome remodeling processes (e.g., vesiculation and tubulation) due to N-BAR domain interactions with the lipid bilayer are explored with a multi-scale simulation approach. Results from atomistic-level molecular dynamics simulations of membrane binding to the concave face of N-BAR domains are used along with discretized mesoscopic field-theoretic simulations to examine how the spontaneous curvature fields generated by N-BAR domains result in membrane remodeling. It is found that tubulation can be generated by anisotropic N-BAR spontaneous curvature fields, whereas vesiculation is only observed with isotropic N-BAR spontaneous curvature fields at high density. The results of the multi-scale simulations provide insight into recent experimental observations.


Subjects
Lipid Bilayers/chemistry; Liposomes/chemistry; Membrane Fluidity; Models, Chemical; Models, Molecular; Nerve Tissue Proteins/chemistry; Nerve Tissue Proteins/ultrastructure; Binding Sites; Computer Simulation; Molecular Conformation; Protein Binding; Protein Structure, Tertiary
11.
Proc Natl Acad Sci U S A ; 103(41): 15068-72, 2006 Oct 10.
Article in English | MEDLINE | ID: mdl-17008407

ABSTRACT

The process of membrane curvature generation by BAR (Bin/amphiphysin/Rvs) domains is thought to involve the plastering of the negatively charged cell membrane to the positively charged concave surface of the BAR domain. Recent work [Peter, B. J., et al. (2004) Science, 303,495-499; Masuda, M., et al. (2006) EMBO J. 25, 2889-2897; and Gallop, J. L., et al. (2006) EMBO J. 25, 2898-2910] has demonstrated the importance of the charged, crescent-shaped surface and the N-terminal amphipathic helices (present in N-BAR domains) for generating membrane curvature. These experiments suggest that curvature is generated by the synergistic action of the N-terminal helices embedding in the lipid bilayer and the charged crescent-shaped dimer acting to "scaffold" membrane curvature. Here, we present atomistic molecular dynamics simulations that directly show membrane binding to the concave face of N-BAR domains, resulting in the generation of local membrane curvature that matches the curvature presented by the BAR domain. These simulations provide direct molecular-scale evidence that BAR domains create curvature by acting as a scaffold, forcing the membrane to locally adopt the intrinsic shape of the BAR domain. We find that BAR domains bind strongly through the maximum curvature surface and, additionally, at an orientation that presents a lesser degree of curvature, thus enabling N-BAR domains to induce a range of local curvatures. Finally, we find that the N-terminal region may play a role in biasing the orientations of N-BAR domains on the membrane surface to those that favor binding to the concave face and subsequent membrane bending.


Subjects
Cell Membrane/chemistry; Computer Simulation; Cytoskeletal Proteins/chemistry; Drosophila Proteins/chemistry; Models, Biological; Models, Molecular; Nerve Tissue Proteins/chemistry; Transcription Factors/chemistry; Animals; Cell Membrane/physiology; Crystallography, X-Ray; Cytoskeletal Proteins/physiology; Drosophila Proteins/physiology; Forkhead Transcription Factors; Humans; Lipid Bilayers/chemistry; Nerve Tissue Proteins/physiology; Protein Structure, Tertiary; Static Electricity; Thermodynamics; Transcription Factors/physiology
12.
J Phys Chem B ; 109(39): 18673-9, 2005 Oct 06.
Article in English | MEDLINE | ID: mdl-16853402

ABSTRACT

A nonequilibrium molecular dynamics simulation of the response of dimyristoylphosphatidylcholine (DMPC) bilayers to a solvent shear flow is presented. Application of shear flow to planar, stationary DMPC bilayers results in a redistribution of the membrane density profile along the bilayer normal due to the alignment of the lipids in the direction of flow and an increase in average lipid chain length. An increase in the intermolecular and intramolecular order of the lipids in response to the shear flow is also observed. This study provides groundwork for understanding the mechanism of the full response of lipid bilayers to externally imposed solvent shear flows, beginning with the response in the absence of collective lipid motions such as undulations and bilayer flow.


Subjects
Lipid Bilayers; Molecular Probes